Instead of storing the account_id in the Products table, we can store it in a separate table, so each individual value of that attribute occupies a separate row. This new table Contacts implements a many-to-many relationship between Products and Accounts:

Creating Contacts table

When the table has foreign keys referencing two tables, it’s called an intersection table. This implements a many-to-many relationship between the two referenced tables. That is, each product may be associated through the intersection table to multiple accounts, and likewise, each account may be associated with multiple products. See the Intersection table Entity-Relationship Diagram below.

Intersection table Entity-Relationship Diagram

Let’s see how using an intersection table resolves all the problems we saw in the previous lesson.

Querying products by account and the other way around#

In order to query the attributes of all products for a given account, it’s more straightforward to join the Products table with the Contacts table.

Let’s run the code in the playground below. You can also update the code written below.

Querying the attributes of all products for a given account

Some people resist queries that contain a JOIN, thinking that they perform poorly. However, this query uses indexes much better than the solution shown earlier in the previous lesson.

Querying account details is likewise easy to read and easy to optimize. It uses indexes for JOIN efficiently, instead of an esoteric use of “regular expressions.”

Let’s run the query below to see the effect of this query on the database.

Querying the attributes of all accounts for a given product

Note: The records for account_id having the values 12, 23, and 34 are already available in the database. If we want to query some more data, we can add the data by using the standard syntax.

The same is the case with the Products table. The records for 123, 345, and 567 are already available. If we want to query some more data, we can add the data by using the standard syntax.

Making aggregate queries#

Let’s run the given query in the following widget. We can also try to use some other aggregate queries that use functions like COUNT(), SUM(), and AVG(). The following example returns the number of accounts per product:

Querying number of accounts per product

The number of products per account is just as simple:

Number of products per account

Note: Try to run the query given above. You can edit the code and see the effect on the database.

Other more sophisticated reports are possible too, such as the product with the greatest number of accounts:

Product with the greatest number of accounts

Updating contacts for a specific product#

We can add or remove entries in the list by inserting or deleting rows in the intersection table. Each product reference is stored in a separate row in the Contacts table, so we can add or remove them one at a time.

Inserting data for a specific product

Note: If we want to insert data other than what is written in the playground, we must first check the available data by querying the specific table.

Now, let’s try to delete the same data that we added in the previous playground.

Removing data for a specific product

Validating product IDs#

We can use a foreign key to validate the entries against a set of legitimate values in another table. We declare that Contacts.account_id references Accounts.account_id, and therefore rely on the database to enforce referential integrity. Now we can be sure that the intersection table contains only account IDs that exist.

We can also use SQL data types to restrict entries. For example, if the entries in the list should be valid INTEGER or DATE values and we declare the column using those data types, we can be sure that all entries are legal values of that type (not nonsense entries like “banana”).

Let’s add “banana” instead of “456” product_id and see what happens.

Inserting invalid data for product_id

Choosing a separator character#

We don’t use any separator character since we store each entry on a separate row. In this way, there’s no ambiguity even if the entries themselves contain commas or other characters that may be used as separators.

List length limitations#

Since each entry is in a separate row in the intersection table, the list is limited only by the number of rows that can physically exist in one table. If it’s appropriate to limit the number of entries, we should enforce the policy in our application using the count of entries rather than the collective length of the list.

Other advantages of the intersection table#

Creating an index on Contacts.account_id makes performance better than matching a substring in a comma-separated list. Declaring a foreign key on a column implicitly creates an index on that column in many database brands.

We can also create additional attributes for each entry by adding columns to the intersection table. For example, we can record the date a contact was added for a given product, or we can specify an attribute to differentiate primary contacts from secondary contacts. We can’t do this in a comma-separated list.

Antipattern: Format Comma-Separated Lists
Synopsis: Naive Trees
Mark as Completed
Report an Issue